Magnetic resonance (MR) and computed tomography (CT) images are two typical types of medical images that provide mutually complementary information for accurate clinical diagnosis and treatment. However, obtaining both images may be limited by considerations such as cost, radiation dose, and missing modalities. Recently, medical image synthesis has attracted growing research interest as a way to cope with this limitation. In this paper, we propose a bidirectional learning model, denoted as dual contrast cycleGAN (DC-cycleGAN), to synthesize medical images from unpaired data. Specifically, a dual contrast loss is introduced into the discriminators to indirectly build constraints between real source and synthetic images by using samples from the source domain as negative samples, forcing the synthetic images to fall far away from the source domain. In addition, cross-entropy and the structural similarity index (SSIM) are integrated into DC-cycleGAN to account for both the luminance and structure of samples when synthesizing images. The experimental results indicate that DC-cycleGAN produces promising results compared with other cycleGAN-based medical image synthesis methods such as cycleGAN, RegGAN, DualGAN, and NiceGAN. The code will be available at https://github.com/JiayuanWang-JW/DC-cycleGAN.
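As a rough illustration of the SSIM term mentioned above, the following sketch computes a single-window (global) SSIM between two flattened images using the standard SSIM formula. It is not the DC-cycleGAN implementation: the function name `global_ssim`, the flattened-list input, and the default constants are assumptions made for illustration, and practical implementations usually average SSIM over local windows.

```python
def global_ssim(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two equally sized, flattened images.

    Hypothetical helper for illustration only; x and y are sequences of
    pixel intensities in [0, data_range].
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    # Stabilizing constants from the standard SSIM definition.
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Luminance term compares the means; the second term combines
    # contrast and structure through the variances and covariance.
    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast_structure = (2 * cov_xy + c2) / (var_x + var_y + c2)
    return luminance * contrast_structure


if __name__ == "__main__":
    real = [0.1, 0.5, 0.9, 0.3]
    inverted = [1.0 - v for v in real]
    print(global_ssim(real, real))
    print(global_ssim(real, inverted))
```

An image compared with itself scores 1, while an intensity-inverted copy scores lower because its covariance with the original is negative. A loss would typically use something like `1 - SSIM`, though the exact form used in the paper is not stated in this abstract.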
Translated by Google Translate
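Quantifying the uncertainty of supervised learning models plays an important role in making more reliable predictions. Epistemic uncertainty, which is usually due to insufficient knowledge about the model, can be reduced by collecting more data or refining the learning model. Over the past few years, scholars have proposed many techniques for handling epistemic uncertainty, which can be roughly grouped into two categories: Bayesian and ensemble. This paper provides a comprehensive review of epistemic uncertainty learning techniques in supervised learning over the last five years. To this end, we first decompose epistemic uncertainty into bias and variance terms. We then introduce a hierarchical categorization of epistemic uncertainty learning techniques along with their representative models. In addition, several applications, such as computer vision (CV) and natural language processing (NLP), are presented, followed by a discussion of research gaps and possible future research directions.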
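To make the bias and variance terms concrete, here is a minimal sketch based on the textbook squared-error identity for an ensemble of regressors at a single input. The survey's exact decomposition is not given in this abstract, so this is an assumption-laden illustration, and the function name `bias_variance` is hypothetical.

```python
def bias_variance(predictions, target):
    """Split the mean squared error of ensemble members at one input.

    predictions: one prediction per ensemble member (hypothetical setup).
    Returns (squared bias, variance, mean squared error), which satisfy
    mse == bias_sq + variance up to floating-point rounding.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    m = len(predictions)
    mean_pred = sum(predictions) / m
    # Squared bias: how far the ensemble mean is from the target.
    bias_sq = (mean_pred - target) ** 2
    # Variance: disagreement among members, a common epistemic proxy.
    variance = sum((p - mean_pred) ** 2 for p in predictions) / m
    mse = sum((p - target) ** 2 for p in predictions) / m
    return bias_sq, variance, mse


if __name__ == "__main__":
    b, v, e = bias_variance([2.0, 2.5, 3.0, 3.5], target=3.0)
    print(b, v, e)
```

The variance term shrinks when ensemble members agree, which is why member disagreement is often used as a signal of epistemic uncertainty in ensemble methods.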
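Generalized zero-shot learning (GZSL) aims to train a model to classify data samples when some output classes are unknown during supervised learning. To address this challenging task, GZSL leverages semantic information about the seen (source) and unseen (target) classes to bridge the gap between them. Many GZSL models have been formulated since the task was introduced. In this review paper, we present a comprehensive review of GZSL. First, we provide an overview of GZSL, including its problems and challenges. We then introduce a hierarchical categorization of GZSL methods and discuss the representative methods in each category. In addition, we discuss the available benchmark datasets and applications of GZSL, along with research gaps and directions for future research.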
Uncertainty quantification (UQ) plays a pivotal role in reducing uncertainties during both optimization and decision making, and is applied to a variety of real-world problems in science and engineering. Bayesian approximation and ensemble learning techniques are two of the most widely used UQ methods in the literature. In this regard, researchers have proposed different UQ methods and examined their performance in a variety of applications such as computer vision (e.g., self-driving cars and object detection), image processing (e.g., image restoration), medical image analysis (e.g., medical image classification and segmentation), natural language processing (e.g., text classification, social media texts and recidivism risk-scoring), bioinformatics, etc. This study reviews recent advances in UQ methods used in deep learning, investigates the application of these methods in reinforcement learning, and highlights the fundamental research challenges and directions associated with the UQ field.
Coronary Computed Tomography Angiography (CCTA) provides information on the presence, extent, and severity of obstructive coronary artery disease. Large-scale clinical studies analyzing CCTA-derived metrics typically require ground-truth validation in the form of high-fidelity 3D intravascular imaging. However, manual rigid alignment of intravascular images to corresponding CCTA images is both time-consuming and user-dependent. Moreover, intravascular modalities suffer from several non-rigid, motion-induced distortions arising from deviations in the imaging catheter path. To address these issues, we present a semi-automatic segmentation-based framework for both rigid and non-rigid matching of intravascular images to CCTA images. We formulate the problem in terms of finding the optimal \emph{virtual catheter path} that samples the CCTA data to recapitulate the coronary artery morphology found in the intravascular image. We validate our co-registration framework on a cohort of $n=40$ patients using bifurcation landmarks as ground truth for longitudinal and rotational registration. Our results indicate that our non-rigid registration significantly outperforms other co-registration approaches for luminal bifurcation alignment in both longitudinal (mean mismatch: 3.3 frames) and rotational directions (mean mismatch: 28.6 degrees). By providing a differentiable framework for automatic multi-modal intravascular data fusion, our co-registration modules significantly reduce the manual effort required to conduct large-scale multi-modal clinical studies while also providing a solid foundation for the development of machine learning-based co-registration approaches.
The robustness and accuracy of a vision system for motion estimation of a tumbling target satellite are enhanced by an adaptive Kalman filter. This allows a vision-guided robot to complete the grasping of the target even if occlusion occurs during the operation. A complete dynamics model, including aspects of orbital mechanics, is incorporated for accurate estimation. Based on the model, an adaptive Kalman filter is developed that estimates not only the system states but also all the model parameters such as the inertia ratio, center-of-mass, and the rotation of the principal axes of the target satellite. An experiment is conducted by using a robotic arm to move a satellite mockup according to orbital mechanics while the satellite pose is measured by a laser camera system. The measurements are sent to the Kalman filter, which, in turn, drives another robotic arm to grasp the target. The results demonstrate successful grasping even if the vision system is blocked for several seconds.
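The idea of continuing to track a target while the vision system is blocked can be sketched with a plain linear Kalman filter that skips the measurement update whenever no measurement arrives. This is a deliberately simplified stand-in, not the adaptive filter described above: it assumes a one-dimensional constant-velocity model rather than the full orbital dynamics, it does not estimate any model parameters, and the function name `track` and the noise values are hypothetical.

```python
def track(measurements, dt=1.0, q=1e-4, r=1e-2):
    """1-D constant-velocity Kalman filter that tolerates occlusion.

    measurements: list of position readings; None marks an occluded step.
    q, r: assumed process and measurement noise variances (illustrative).
    Returns a list of (position, velocity, position_variance) per step.
    """
    pos, vel = 0.0, 0.0
    p00, p01, p10, p11 = 1.0, 0.0, 0.0, 1.0  # initial covariance
    history = []
    for z in measurements:
        # Predict: x = F x, P = F P F^T + Q with F = [[1, dt], [0, 1]].
        pos = pos + dt * vel
        n00 = p00 + dt * (p10 + p01) + dt * dt * p11 + q
        n01 = p01 + dt * p11
        n10 = p10 + dt * p11
        n11 = p11 + q
        p00, p01, p10, p11 = n00, n01, n10, n11
        if z is not None:
            # Update with H = [1, 0]; skipped entirely during occlusion.
            s = p00 + r
            k0, k1 = p00 / s, p10 / s
            innovation = z - pos
            pos += k0 * innovation
            vel += k1 * innovation
            p00, p01, p10, p11 = (
                (1 - k0) * p00,
                (1 - k0) * p01,
                p10 - k1 * p00,
                p11 - k1 * p01,
            )
        history.append((pos, vel, p00))
    return history


if __name__ == "__main__":
    # Target moves at unit speed; steps 11 to 15 are occluded.
    zs = [float(t) if not 11 <= t <= 15 else None for t in range(1, 21)]
    for step, (p, v, var) in enumerate(track(zs), start=1):
        print(step, round(p, 3), round(v, 3), round(var, 4))
```

During the occluded steps the filter relies on the motion model alone, so the position variance grows until measurements resume. That growing uncertainty is what a more complete dynamics model is meant to limit.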
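This paper presents a method for controlling a manipulator system grasping a rigid-body payload so that the motion of the combined system in response to externally applied forces is the same as that of another free-floating rigid body (with different inertial properties). This allows zero-g emulation of a scaled spacecraft prototype under test in a 1-g laboratory environment. The controller, consisting of motion feedback and force/moment feedback, adjusts the motion of the test spacecraft to match that of the flight spacecraft, even if the latter has flexible appendages (such as solar panels) and the former is rigid. The stability of the overall system is investigated analytically, and the results show that the system remains stable provided that the inertial properties of the two spacecraft are different and that an upper bound on the inertia ratio of the payload to the manipulator is respected. Important practical issues, such as calibration and sensitivity analysis with respect to sensor noise and quantization, are also presented.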
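While multilingual vision-language pretrained models have brought several benefits, recent benchmarks across various tasks and languages show poor cross-lingual generalization when multilingually pretrained vision-language models are applied to non-English data, with a large gap between (supervised) English performance and (zero-shot) cross-lingual transfer. In this work, we explore the poor performance of these models on a zero-shot cross-lingual visual question answering (VQA) task, where models are fine-tuned on English visual-question data and evaluated on 7 typologically diverse languages. We improve cross-lingual transfer with three strategies: (1) we introduce a linguistic prior objective that augments the cross-entropy loss with a similarity-based loss to guide the model during training, (2) we learn a task-specific subnetwork that improves cross-lingual generalization and reduces variance without modifying the model, and (3) we augment the training examples using synthetic code-mixing to promote alignment of embeddings between the source and target languages. Our experiments on xGQA using the pretrained multilingual multimodal transformers UC2 and M3P demonstrate the consistent effectiveness of the proposed fine-tuning strategies for 7 languages, outperforming existing transfer methods with sparse models. The code and data to reproduce our findings are publicly available.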
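The self-tuning control problem of cooperative manipulators forming a closed kinematic chain in the presence of an inaccurate kinematics model is addressed using adaptive machine learning. Two cascaded estimators update, online, the kinematic parameters pertaining to the relative position/orientation uncertainties of the interconnected manipulators in order to tune a cooperative controller that achieves accurate motion tracking with minimum actuation force. The technique permits accurate calibration of the relative kinematics of the manipulators involved without requiring high-precision end-point sensing or force measurements, and hence it is economically justified. Investigating the stability of the entire real-time estimator/controller system shows that the convergence and stability of the adaptive control process can be ensured if (i) the direction of the angular velocity vector does not remain constant over time, and (ii) the parameter error is upper-bounded by a scalar function of some known parameters. The adaptive controller is proved to be singularity-free even though the control law involves inverting an approximation of a matrix computed at the estimated parameters. Experimental results demonstrate the sensitivity of the tracking performance of the conventional inverse dynamics control scheme to kinematic inaccuracies, whereas the tracking error is significantly reduced by the self-tuning cooperative controller.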
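This paper focuses on an adaptive and fault-tolerant vision-guided robotic system that can choose the most appropriate control action if partial or complete failure of the vision system occurs in the short term. Moreover, the autonomous robotic system takes physical and operational constraints into account to perform the demands of a specific visual servoing task in a way that minimizes a cost function. A hierarchical control architecture is developed based on the interwoven integration of a variant of iterative closest point (ICP) image registration, a constrained noise-adaptive Kalman filter, fault detection logic and recovery, and a constrained optimal path planner. The dynamic estimator estimates the unknown states and uncertain parameters required for motion prediction, while imposing a set of inequality constraints for the consistency of the estimation process and adapting the Kalman filter parameters in the face of unexpected vision errors. This is followed by a fault recovery strategy based on a fault detection logic that monitors the health of the visual feedback using the metric fit error of the image registration. Subsequently, the estimated/predicted pose and parameters are passed to the optimal path planner to bring the robot end-effector to the grasping point of the moving target as quickly as possible while respecting the line-of-sight angle of the target.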